fivetran-felixhuang (Collaborator) commented Jan 5, 2026

https://docs.snowflake.com/en/sql-reference/functions/to_double

DuckDB's CAST can parse a string into a number as long as the string contains only digits and a few special characters such as '.', '+' or '-'.

For the transpilation of TO_DOUBLE to work, we need to remove unsupported characters from the input string using REGEXP_REPLACE. Also, if the sign appears at the end, we need to move it to the front if it's '-', or drop it if it's '+'.
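
The cleanup described above can be mimicked in plain Python to see what string DuckDB's CAST ends up receiving. This is only an illustrative sketch: `normalize_numeric_string` is a hypothetical helper name, not part of sqlglot, and the character class `[,$\s]` is taken from the transpiled query below.

```python
import re


def normalize_numeric_string(text: str) -> str:
    """Hypothetical helper mirroring the generated DuckDB expression."""
    # Same character class as the REGEXP_REPLACE call: commas, '$', whitespace.
    cleaned = re.sub(r"[,$\s]", "", text)
    if cleaned.endswith("-"):
        # Trailing minus: move the sign to the front.
        return "-" + cleaned.rstrip("-")
    if cleaned.endswith("+"):
        # Trailing plus: drop it.
        return cleaned.rstrip("+")
    return cleaned


for raw in ["$1,234.56", "123.45-", "123.45+ ", "-  12345"]:
    print(repr(raw), "->", float(normalize_numeric_string(raw)))
```

Like RTRIM in the SQL version, `rstrip` removes every trailing sign character, not just one.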

Test queries

source query

   SELECT
      TO_DOUBLE('123.45') AS from_string,
      TO_DOUBLE(456) AS from_integer,
      TO_DOUBLE(789.123) AS from_decimal,
      TO_DOUBLE(NULL) AS from_null,
      TO_DOUBLE('$1,234.56', '$9,999.99') AS currency_dollar,
      TO_DOUBLE('-$987.65', 'S$999.99') AS signed_currency,
      TO_DOUBLE('1.23E+02', '9.99EEEE') AS scientific,
      TO_DOUBLE('-4.56E-03', 'S9.99EEEE') AS signed_scientific,
      TO_DOUBLE('-123.45', 'S999.99') AS explicit_sign,
      TO_DOUBLE('123.45-', '999.99MI') AS trailing_minus,
      TO_DOUBLE('-  12345', 'MI9999999') AS mid_minus,
      TO_DOUBLE('  12345', 'MI9999999') AS mid_minus_2,
      TO_DOUBLE('123.45+', '999.99S') AS trailing_plus,
      TO_DOUBLE('00123.45', '00000.00') AS leading_zeros,
      TO_DOUBLE('  123.45', '99999.99') AS padded_spaces,
      TO_DOUBLE('123.45+ ', '999.99SB') AS trailing_plus_with_space,
      TO_DOUBLE('1,234,567.89', '9,999,999.99') AS thousands_comma,
      TO_DOUBLE('inf') AS positive_infinity,
      TO_DOUBLE('-inf') AS negative_infinity,
      TO_DOUBLE('nan') AS not_a_number,
      TO_DOUBLE(CONCAT('1640', '99', '5200')) AS concat_num,
      TO_DOUBLE('111,123.45', '999G999D99') AS d_format,
      TO_DOUBLE(' 0', 'B9') AS blank_with_space,
      TO_DOUBLE(PARSE_JSON('123.45')) AS from_variant_number,
      TO_DOUBLE(PARSE_JSON('"456.78"')) AS from_variant_string,
      TO_DOUBLE(PARSE_JSON('1.7976931348623157e+308')) AS variant_large_float,
      TO_DOUBLE(PARSE_JSON('true')) AS from_variant_boolean,
      TO_DOUBLE(PARSE_JSON('null')) AS from_variant_null;

transpiled query:

SELECT 
CAST('123.45' AS DOUBLE) AS from_string, 
CAST(456 AS DOUBLE) AS from_integer, 
CAST(789.123 AS DOUBLE) AS from_decimal, 
CAST(NULL AS DOUBLE) AS from_null, 
CAST(REGEXP_REPLACE('$1,234.56', '[,$\s]', '', 'g') AS DOUBLE) AS currency_dollar, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('-$987.65', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('-$987.65', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('-$987.65', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('-$987.65', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('-$987.65', '[,$\s]', '', 'g') END AS DOUBLE) AS signed_currency, 
CAST(REGEXP_REPLACE('1.23E+02', '[,$\s]', '', 'g') AS DOUBLE) AS scientific, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('-4.56E-03', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('-4.56E-03', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('-4.56E-03', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('-4.56E-03', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('-4.56E-03', '[,$\s]', '', 'g') END AS DOUBLE) AS signed_scientific, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('-123.45', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('-123.45', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('-123.45', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('-123.45', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('-123.45', '[,$\s]', '', 'g') END AS DOUBLE) AS explicit_sign, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('123.45-', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('123.45-', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('123.45-', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('123.45-', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('123.45-', '[,$\s]', '', 'g') END AS DOUBLE) AS trailing_minus, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('-  12345', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('-  12345', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('-  12345', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('-  12345', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('-  12345', '[,$\s]', '', 'g') END AS DOUBLE) AS mid_minus, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('  12345', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('  12345', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('  12345', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('  12345', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('  12345', '[,$\s]', '', 'g') END AS DOUBLE) AS mid_minus_2, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('123.45+', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('123.45+', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('123.45+', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('123.45+', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('123.45+', '[,$\s]', '', 'g') END AS DOUBLE) AS trailing_plus, 
CAST(REGEXP_REPLACE('00123.45', '[,$\s]', '', 'g') AS DOUBLE) AS leading_zeros, 
CAST(REGEXP_REPLACE('  123.45', '[,$\s]', '', 'g') AS DOUBLE) AS padded_spaces, 
CAST(CASE WHEN RIGHT(REGEXP_REPLACE('123.45+ ', '[,$\s]', '', 'g'), 1) = '-' THEN '-' || RTRIM(REGEXP_REPLACE('123.45+ ', '[,$\s]', '', 'g'), '-') WHEN RIGHT(REGEXP_REPLACE('123.45+ ', '[,$\s]', '', 'g'), 1) = '+' THEN RTRIM(REGEXP_REPLACE('123.45+ ', '[,$\s]', '', 'g'), '+') ELSE REGEXP_REPLACE('123.45+ ', '[,$\s]', '', 'g') END AS DOUBLE) AS trailing_plus_with_space, 
CAST(REGEXP_REPLACE('1,234,567.89', '[,$\s]', '', 'g') AS DOUBLE) AS thousands_comma, 
CAST('inf' AS DOUBLE) AS positive_infinity, 
CAST('-inf' AS DOUBLE) AS negative_infinity, 
CAST('nan' AS DOUBLE) AS not_a_number, 
CAST('1640' || '99' || '5200' AS DOUBLE) AS concat_num, 
CAST(REGEXP_REPLACE('111,123.45', '[,$\s]', '', 'g') AS DOUBLE) AS d_format, 
CAST(REGEXP_REPLACE(' 0', '[,$\s]', '', 'g') AS DOUBLE) AS blank_with_space, 
CAST(JSON('123.45') AS DOUBLE) AS from_variant_number, 
CAST(JSON('"456.78"') AS DOUBLE) AS from_variant_string, 
CAST(JSON('1.7976931348623157e+308') AS DOUBLE) AS variant_large_float, 
CAST(JSON('true') AS DOUBLE) AS from_variant_boolean, 
CAST(JSON('null') AS DOUBLE) AS from_variant_null

Both queries produce the same results, with one caveat: on Snowflake, the variant_large_float case (input 1.7976931348623157e+308) loses precision and comes back as 1.79769313486232e+308.
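
The shortened value is what the original looks like when rendered with 15 significant digits. A quick Python check of that arithmetic follows; it only illustrates the rounding and makes no claim about how Snowflake stores or prints the value internally.

```python
# Largest finite IEEE 754 double, as used in the variant_large_float test case.
value = 1.7976931348623157e+308

# Python's repr keeps enough digits (17 here) to round-trip the double.
full = repr(value)

# Formatting with 15 significant digits rounds the last kept digit up.
rounded = format(value, ".15g")

print(full)
print(rounded)
```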

@fivetran-felixhuang fivetran-felixhuang self-assigned this Jan 5, 2026
@fivetran-felixhuang fivetran-felixhuang marked this pull request as ready for review January 5, 2026 16:54
github-actions bot (Contributor) commented Jan 5, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:transpile_TO_DOUBLE_snowflake_duckdb, sqlglot version: transpile_TO_DOUBLE_snowflake_duckdb)
  • baseline (main, sqlglot version: 28.5.1.dev59)

⚠️ Limited to dialects: snowflake, duckdb

By Dialect

| dialect | main | sqlglot:transpile_TO_DOUBLE_snowflake_duckdb | difference | links |
| --- | --- | --- | --- | --- |
| duckdb -> duckdb | 4003/4003 passed (100.0%) | 4003/4003 passed (100.0%) | No change | full result / delta |
| snowflake -> duckdb | 626/1085 passed (57.7%) | 628/1085 passed (57.9%) | ⬆ improved by 0.2% | full result / delta |
| snowflake -> snowflake | 981/1085 passed (90.4%) | 981/1085 passed (90.4%) | No change | full result / delta |

Overall

main: 6173 total, 5610 passed (pass rate: 90.9%), sqlglot version: 28.5.1.dev59

sqlglot:transpile_TO_DOUBLE_snowflake_duckdb: 6173 total, 5612 passed (pass rate: 90.9%), sqlglot version: transpile_TO_DOUBLE_snowflake_duckdb

Difference: No change

@fivetran-BradfordPaskewitz (Collaborator)

Should we really be replacing unsupported characters? Would it be more explicit to just fail for those cases?

@fivetran-felixhuang (Collaborator, Author)

To be more specific, although these characters are not supported by DuckDB's CAST, Snowflake accepts them because they are part of the formatted numeric string. For example, the '$' and ',' in the string '$1,234.56' are valid when the specified format is '$9,999.99', as in TO_DOUBLE('$1,234.56', '$9,999.99'). So during transpilation we need to remove them.
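
As a rough sketch of when each cleanup step applies, the following Python builds the DuckDB expression as a string. The rule it encodes (plain CAST with no format, REGEXP_REPLACE whenever a format is given, sign handling only when the format contains S or MI) is inferred from the transpiled query in the description, not taken from the PR's code. `duckdb_to_double_sql` is a hypothetical name.

```python
from typing import Optional

STRIP_PATTERN = r"[,$\s]"  # characters removed before the cast


def duckdb_to_double_sql(value_sql: str, fmt: Optional[str] = None) -> str:
    """Hypothetical builder; the rule is inferred from the sample output."""
    if fmt is None:
        # No format model: the value is cast as-is.
        return f"CAST({value_sql} AS DOUBLE)"

    cleaned = f"REGEXP_REPLACE({value_sql}, '{STRIP_PATTERN}', '', 'g')"
    upper = fmt.upper()
    if "S" in upper or "MI" in upper:
        # Sign elements may put the sign at the end of the string.
        cleaned = (
            f"CASE WHEN RIGHT({cleaned}, 1) = '-' THEN '-' || RTRIM({cleaned}, '-') "
            f"WHEN RIGHT({cleaned}, 1) = '+' THEN RTRIM({cleaned}, '+') "
            f"ELSE {cleaned} END"
        )
    return f"CAST({cleaned} AS DOUBLE)"


print(duckdb_to_double_sql("'123.45'"))
print(duckdb_to_double_sql("'$1,234.56'", "$9,999.99"))
print(duckdb_to_double_sql("'123.45-'", "999.99MI"))
```

The cleaned expression is repeated in every CASE branch, which is why the signed cases in the transpiled query are so long.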

value = expression.this
format_arg = expression.args.get("format")

if format_arg and isinstance(format_arg, exp.Literal):
Collaborator

Does this branch handle every format supported by Snowflake? It's quite complicated and so I'd like to understand if the complexity stems from it being a complete solution or just hacking it together.

Collaborator Author

https://docs.snowflake.com/en/sql-reference/sql-format-models

This handles the formats listed under the "Fixed-position numeric formats" section, except for the hexadecimal digit element (I couldn't find a working example of it, so I couldn't test it). These formats are covered by the test query shown above.

There are also text-minimal numeric formats; as far as I know, they are just the default formats for DOUBLE and DECIMAL.

@georgesittas (Collaborator)

I'm concerned about getting this in, because it feels like we're addressing a small subset of the possible format models, and the complexity added to achieve this doesn't look trivial.

Let's postpone dealing with this transpilation until we actually need it.

@georgesittas georgesittas deleted the transpile_TO_DOUBLE_snowflake_duckdb branch January 13, 2026 15:57

4 participants